Record: GDN-Hybrid + Sliding Window Attention (cold-cache, 1.01710 BPB) #1564
Closed
joshkmartinez wants to merge 4 commits into openai:main
sunnypatneedi pushed a commit to sunnypatneedi/parameter-golf that referenced this pull request on Apr 12, 2026:
…1.01710

Merged SOTA changed from 1.1147 to 1.0810 (PR openai#1493, bigbag, 2026-04-09). Six PRs merged in 5 days (PRs openai#1334, openai#1285, openai#1394, openai#1412, openai#1413, openai#1477, openai#1493). New target: ≤1.0760 val_bpb. 18 days to deadline.

Key findings:
- GDN-Hybrid (PR openai#1564): 1.01710 BPB, no TTT/SLOT; monitor for organizer review
- VarLen Attention + Doc-TTT (PR openai#1560): 1.07406 BPB; implement next
- TMA Megakernel + Tap-In (PR openai#1555): 1.07636 BPB; add after openai#1560
- PR openai#731 n-gram (dense count + Laplace): reviewer says LOOKS CLEAN, awaiting 3rd seed
- PR openai#758: major legality flags, do not implement

Updated CLAUDE.md: Competition Strategy, Technique Reference, Lessons Learned (Session 9).
Updated logs/daily_research.md: new 2026-04-12 entry prepended.

https://claude.ai/code/session_011WyxjcwdigLhMFQDjLL5ss
Author
Superseded by PR #1575, which stages the stronger run051-safe031 SAFE_SUBMISSION artifact (1.01671233 BPB, all seeds under cap).
Summary
3-Seed Results
Architecture
This submission uses an SP1024-tokenized GDN-Hybrid backbone with the following high-level structure:
[GDN×5] → SWA → [GDN×5] → SWA_shared
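The layer order in the diagram above can be written out as a small schedule. This is only an illustrative sketch, not the submission's code: `LayerSpec`, `build_layout`, and the `shared_with` field are hypothetical names, and reading `SWA_shared` as "reuses the weights of the first SWA layer" is an assumption that the PR text does not confirm.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LayerSpec:
    """Hypothetical description of one layer in the stack."""
    kind: str                          # "gdn" or "swa"
    shared_with: Optional[int] = None  # index of the layer whose weights are reused (assumption)


def build_layout(gdn_per_group: int = 5) -> List[LayerSpec]:
    """Return the schedule [GDN x n] -> SWA -> [GDN x n] -> SWA_shared."""
    layout: List[LayerSpec] = []
    layout += [LayerSpec("gdn")] * gdn_per_group
    first_swa = len(layout)            # position of the first SWA layer
    layout.append(LayerSpec("swa"))
    layout += [LayerSpec("gdn")] * gdn_per_group
    # Assumption: "SWA_shared" points back at the first SWA layer's weights.
    layout.append(LayerSpec("swa", shared_with=first_swa))
    return layout


if __name__ == "__main__":
    for index, spec in enumerate(build_layout()):
        print(index, spec.kind, spec.shared_with)
```

With the default of five GDN layers per group, the schedule has 12 entries, with SWA layers at indices 5 and 11.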
Key components:
Legality
This is a SAFE_SUBMISSION / Track-A fixed-predictor result. The scored artifact uses no TTT, no SLOT, and no eval-time adaptation. All three pulled artifacts are under the 16,000,000-byte cap.
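A size check of this kind could be scripted roughly as follows. This is a sketch under stated assumptions: `under_cap` and `check_artifacts` are hypothetical helpers, the strict comparison follows the wording "under the cap" (whether a file of exactly 16,000,000 bytes would pass is not stated), and the real artifact paths are not given in the PR text.

```python
import os
from typing import Dict, Iterable, Tuple

CAP_BYTES = 16_000_000  # cap quoted in the Legality section


def under_cap(size_bytes: int, cap: int = CAP_BYTES) -> bool:
    """True if the size is strictly below the cap (strictness is an assumption)."""
    return size_bytes < cap


def check_artifacts(paths: Iterable[str], cap: int = CAP_BYTES) -> Dict[str, Tuple[int, bool]]:
    """Map each artifact path to (size in bytes, whether it is under the cap)."""
    report: Dict[str, Tuple[int, bool]] = {}
    for path in paths:
        size = os.path.getsize(path)  # on-disk size of the artifact file
        report[path] = (size, under_cap(size, cap))
    return report
```

Calling `check_artifacts` on the three per-seed artifact files would return one `(size, ok)` pair per file, and the submission claim corresponds to every `ok` being `True`.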
Credits